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ABSTRACT 

Current results from the Lyman-a forest assume that the primordial power spectrum 
of density perturbations follows a simple power-law form, with running. We present the 
first analysis of Lyman-a data to study the effect of relaxing this strong assumption 
on primordial and astrophysical constraints. We perform a large suite of numerical 
simulations, using them to calibrate a minimally parametric framework for describing 
the power spectrum. Combined with cross-validation, a statistical technique which 
prevents over-fitting of the data, this framework allows us to reconstruct the power 
spectrum shape without strong prior assumptions. We find no evidence for deviation 
from scale-invariance; our analysis also shows that current Lyman-a data do not have 
sufficient statistical power to robustly probe the shape of the power spectrum at these 
scales. In contrast, the ongoing Baryon Oscillation Sky Survey will be able to do so with 
high precision. Furthermore, this near-future data will be able to break degeneracies 
between the power spectrum shape and astrophysical parameters. 

Key words: cosmology: theory - methods: numerical - methods: statistical - galaxies: 
intergalactic medium 



1 INTRODUCTION 

The primordial power spectrum of density fluctuations un- 
derpins much of modern cosmology. On large scales, it has 
been measured with high precision by cosmic microwave 



background (CMB) experiments (e.g. Komatsu et al. 2011 
and references within). In order to improve our knowledge 
of its scale-dependence, we turn to smaller scales, and astro- 
physical measurements probing later epochs in the evolution 
of the Universe. In this paper, we shall examine constraints 
from the data set which has probed the smallest scales to 
date: the Lyman-a forest. 

The Lyman-a forest consists of a series of features in 
quasar spectra due to scattering of quasar photons with neu- 
tral hydrogen. Since hydrogen makes up most of the bary- 
onic density of the Universe, the Lyman-a forest traces the 
intergalactic medium (IGM), and thus the baryonic power 
spectrum, on scales from a few up to tens of Mpc. This makes 
it the only currently available probe of fluctuations at small 
scales in a regime when the corresponding density fluctua- 
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tions were still only mildly non-linear, thereby simplifying 
cosmological inferences. A number of authors have exam- 
ined the cosmological constraints from the Lyman-a forest 



in the past (Croft et al. |1998 


Theuns et al.||1998 |McDon- 


aid et al.||2000||Hui et al.|2001 


|Viel et al. ||2002 


Gnedin & 


Hamilton 2002; McDonald et al.|005b||Viel & Haehnelt 2006 


Lidz et al.|2006l, while Seljak et al. (2005, 2006 


1 examined 



constraints combined with other data sets. For a review of 
the physics of the IGM and its potential for cosmology, see 



Meiksin (2009) 



Previous analyses assumed that the primordial power 
spectrum on Lyman-a scales is described by a nearly scale- 
invariant power law - a strong prior - and proceeded with 
parameter estimation under this assumption. In contrast, 
in this work we attempt to constrain the shape and ampli- 
tude of the primordial power spectrum at these scales using 
minimal prior assumptions about its scale-dependence. 

In view of the observational effort dedicated to the 
Lyman-a forest, and its promise as a probe of the primordial 
power spectrum, in this work we shall explore the possibili- 
ties of going beyond parameter fitting. To give us insight into 
the underlying model for the power spectrum shape, which 
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parameter estimation by itself cannot do, our present appli- 
cation to Lyman-a data should therefore ideally assume full 
shape freedom throughout the analysis. As a nearly scale- 
invariant primordial power spectrum is a generic prediction 
of the simplest models of inflation, a minimally parametric 
reconstruction can be a powerful test of inflationary models. 
Lyman-a constrains the smallest cosmological scales; thus, 
it provides the longest lever-arm when combined with the 
statistical power and robustness of CMB data, yielding the 
best opportunity presently available to understand the over- 
all shape of the power spectrum. 

The main Lyman-a observable, the flux power spec- 
trum, does not have a simple algebraic relationship to the 
matter power spectrum. By z ~ 3, the absorbing struc- 
tures are weakly nonlinear, and are also affected by bary- 
onic physics. Hence, to establish the relationship between 
the primordial power spectrum and the flux power spec- 
trum, we must resort to hydrodynamical simulations. The 
initial conditions used in our simulations allow for consid- 
erable freedom in the shape of the primordial power spec- 
trum, and this allows us to recreate the Lyman-a forest 
resulting from generic power spectrum shapes. Using an en- 
semble of simulations which sample the parameter space re- 
quired to describe the flux power spectrum, we construct 
a likelihood function which can be used to perform min- 
imally parametric reconstruction of the primordial power 
spectrum, while simultaneously constraining parameters de- 
scribing IGM physics. 

A statistical technique called cross-validation (CV) is 
used to robustly reconstruct the primordial power spectrum 
and Markov Chain Monte Carlo (MCMC) techniques are 
used to obtain the final constraints. The statistical approach 



parallelslSealfon et al. ( 20051) and Verde & Peiris (2008), who 



applied the same method to data from the CMB and galaxy 
surveys. Peiris & Verde (20101 added the current Lyman- 



a forest data to the joint analysis with larger scale data, 
via the derived constraints on the small-scale matter power 



spectrum from McDonald et al. (005b). However, these lat- 



ter constraints were derived assuming a tight prior on the 
shape of the primordial power spectrum at Lyman-a scales 
- an assumption which we drop in this work. In our anal- 
ysis we consider both the flux power spectrum determined 
by |McDonald et al.| ( |2006| ) from low-resolution quasar spec- 
tra obtained during the SDSS, and simulated data for the 



upcoming Baryon Oscillation Sky Survey (BOSS: |Schlegel 
|et al.|2009| >. 

This paper is organized as follows. In Section [2] we re- 
view the framework for power spectrum reconstruction and 
describe the details of the simulations and parameter esti- 
mation setup. Section [3] describes the data, and results are 
presented in Section [4] We conclude in Section [5] Techni- 
cal details of our calculations are relegated to Appendices [A] 
and|U 



2 METHODS 

In this section, we describe the statistical technique used 
in this paper, and how we built the likelihood function for 
minimally parametric reconstruction from Lyman-a data. 
Section [2. 1| describes the framework for power spectrum re- 
construction in general terms, while Section|2.2|gives further 



details of our specific implementation of this framework. Sec- 
tions 12.31 - 12.51 detail numerical methods used to extract a 
flux power spectrum from a given primordial power spec- 
trum. Finally, Section |2.6| describes the parameter estima- 
tion implementation. 

2.1 Power spectrum reconstruction 



Previous analyses of the Lyman-a forest (McDonald et al. 
|005b| |Viel et al.||2004| have assumed that the primordial 
power spectrum is a nearly scale-invariant power law, of the 
form 



P(k) = A s 



and then constrained A a 



- l + a s In k 



(1) 



and a a . In this work we will fol- 



low the same spirit as Sealfon et al. (20051; Verde & Peiris 



( |2008[ ), going beyond parameter estimation in an attempt to 
deduce what the Lyman-a forest data can tell us about the 
shape of the power spectrum under minimal prior assump- 
tions. A major challenge involved in all such reconstructions 
is to avoid over-fitting the data; it is of little use to produce 
a complex function that fits the data set extremely well if 
we are simply fitting statistical noise. Equally, an overly pre- 
scriptive function which is a poor fit to the data should be 
rejected. To achieve this balance, we add an extra term, jCp, 
to the likelihood function which penalizes superfluous fluc- 
tuations. Schematically, the likelihood function is: 



log£ = log£(Data|P(fc)) + Alog£ P . 



(2) 



where the form of Cp will be discussed shortly. Equation [2] 
now contains an extra free parameter which measures the 
magnitude of the smoothing required; the penalty weight A. 
As A — > oo the likelihood will implement linear regression. 
For particularly clean data, carrying no evidence for any 
feature in the P(k), A should be large. Data carrying strong 
evidence for P(k) features would be best analysed with a 
small value of A. We need a method of determining, from 
the data, the optimal penalty weight. Our chosen technique 
is called CV ( Green fc Silverman|1994 1, which quantifies the 
idea that a correct reconstruction of the underlying informa- 
tion should accurately predict new, independent data. 

The variant of CV used in this paper splits the data into 
three sets. The function is reconstructed using two of these 
sets (training sets). The likelihood (excluding the penalty 
term) of this reconstruction, given only the data in the re- 
maining set (validation set), is calculated. This step is called 
validation; because the data in each set are assumed to be 
independent, we now have a measure of the predictivity of 
the reconstruction. Validation is repeated using each set in 
turn and the total CV score is the sum of all three validation 
likelihoods. The optimal penalty is the one which maximizes 
the CV score. 

More generally, CV splits the data into k independent 
sets, with 2 < k ^ n, where n is the number of data points. 
k — 1 sets are used for training, and the remaining set for 
validation. Larger k allows for a bigger training set, and thus 
better estimation of the function to be validated against, 
but for most practical problems large k is computationally 
intractable. We have chosen k = 3 as a compromise. We 
verified that using k = 2, following Verde & Peiris (20081, 
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made a negligible difference to our results despite the smaller 
training set size. 

CV assumes that each set is uncorrelated; a mild vio- 
lation of this assumption will lead to an underestimation of 
errors, but not a systematic bias in the derived parameters 
Carmack et al. (20091. Our data include a full covariance 



Knot Position 
(Mpc" 1 ) 



P(k) (/10- 9 ) 

Minimum Maximum 



matrix, and so we are able to verify that correlations be- 
tween the sets are weak. 

The minimally parametric framework applied in this 



paper follows that of Sealfon et al. (20051; Verde & Peiris 



(2008) and |Peiris fc Verde| ( |2010[ ). It uses cubic splines to re- 
construct a function f(x) from measurements at a series of 
points, Xi, called the knots. The function value between each 
pair of knots is interpolated using a piecewise cubic polyno- 
mial. The spline is fully specified by the knots, continuity 
of the first and second derivatives, and boundary conditions 
on the second derivatives at the exterior knots (the knots at 
either end of the spline). The splines have vanishing second 
derivative at the exterior knots. If the power spectrum is 
given by smoothed splines, the form of the likelihood func- 
tion given above is 



log£ = Iog£(Data|P(fc)) + A J dlnk(P"(k)f 

d 2 P 

where P"(k) 



(3) 



d(lnfc) 2 



2.2 Knot placement 



The number and placement of the knots is chosen initially 
and kept fixed throughout the analysis. Once there are suf- 
ficient knots to allow a good fit to the data, adding more 
will not alter the shape of the reconstructed function sig- 
nificantly. In choosing the number of knots, we seek to find 
a balance between allowing sufficient freedom in the power 
spectrum, and having few enough parameters that the data 
are still able to provide meaningful constraints on two sets 
out of three when subdividing the data into the training 
and validation sets, as described above. Available comput- 
ing resources limit us in any case to considering only a few 
knots. We fit the primordial power spectrum with a four- 
knot spline for the Lyman-a forest fc-range. The flux power 
spectrum is available in twelve fc-bins, so there are three bins 
per knot, which should allow sufficient freedom. By compar- 
ison, |Peiris fc Verde] (2010) used seven knots to cover the k- 
range spanned by CMB, galaxy surveys and Lyman-a data, 
with a single knot for the Lyman-a forest. 

The SDSS flux power spectrum covers the range of 
scales, in velocity units, of k v — 1.41 x 10 -3 — 0.018 s/km. 
Dividing by a factor of H(z)/(1 + z) converts to comoving 
distance coordinates, so the constraints on the matter power 
spectrum are on scales of roughly k — 0.4 — 3 Mpc -1 . In this 
range of scales we place four knots (A-D, from large to small 
scales) evenly in log space. Numerical details of the knots 
are shown in Table [l] The maximum and minimum values 
of P(k) given there for each knot are simply the extremal 
values covered by our simulations. Simulation coverage of 
P(k) has been expanded where necessary to fully cover the 
range allowed by the data. 

We must specify the primordial power spectrum on 
scales well outside the range probed by data, even though 
they have no effect on the Lyman-a forest. This is for two 



A 


0.475 


0.83 


3.25 


13 


0.75 


0.60 


3.23 


C 


1.19 


0.60 


3.67 


D 


1.89 


0.53 


4.16 



Table 1. Positions of the knots. The maximum and minimum 
values of P(k) are the extremal values covered by our simulations. 
Fixed knots are not shown, but are discussed in the text. 



reasons. The first is that when running a simulation we must 
have a well-defined way to perturb the initial particle grid 
for all scales included in the simulation. In order to ensure 
that the scales on which we have data are properly resolved, 
we also need to simulate larger and smaller scales , and these 
require a defined power spectrum. The second reason is that 
our interpolation scheme works best when the perturbations 
induced by altering one of the knots are reasonably local. 
Adding extra end knots helps to prevent large secondary 
boundary effects, which would make interpolation far more 
difficult. 

For numerical stability reasons, we would like the am- 
plitude of fluctuations on these scales to be reasonably con- 
stant, but do not wish to make strong assumptions about the 
amplitude of the power spectrum there. Therefore we add 
two "follower" knots at each end of the spline. The ampli- 
tude is fixed to follow the nearest parameter knot, assuming 
that between follower and followed, the shape is a power 
law with n s = 0.97. Q The two follower knots are at scales 
of k = (0.15, 4) Mpc -1 . 

We also add a few knots, even more distant from the 
scales probed by the data, with completely fixed ampli- 
tudes consistent with the WMAP best-fitting power spec- 
trum. The amplitude of the primordial power spectrum 
on these scales does not significantly affect results; we 
have added knots here so that the initial density field is 
well-defined on a larger range of scales than probed by 
the simulation. This allows us to avoid any boundary ef- 
fects associated with the ends of the spline. These fixed 
knots are at k = (0.07, 25, 40)Mpc~ 1 , with amplitudes of 
(2.43, 2.03, 2.01) x 10~ 9 . Fig. fl] shows the effect of altering 
the amplitude of the D knot on the flux power spectrum. 

2.3 Simulations 

In this study, full hydrodynamical simulations were run us- 



ing the parallel TreePM code GADGET- 2 ( jSprin gel 2005). 
GADGET computes long-range gravitational forces using a 
particle grid, while the short-range physics are calculated 
using a particle tree. Hydrodynamical effects are included 
by having two separate particle types; dark matter, affected 
only by gravity, and baryons, modelled using smoothed par- 
ticle hydrodynamics (SPH), where particles are supposed to 
approximate density elements in the matter fluid. The rest 



1 Hence, if the amplitude of the power spectrum at the D-knot is 
P D , the power spectrum at the follower knot has the amplitude 
P D (k D /k D+1 ) 03 , where k D is the position of the D knot and 
fcD+i j- ne p OS ition of the follower. 
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Figure 1. Effect on the flux power spectrum of varying the D 
knot at z = 3. On a scale where the best-fitting amplitude is 0.9, 
the amplitudes of the D knot are, from the lowest line upwards, 
0.5, 0.7, 1.1, 1.3 and 1.7. Non-linear growth tends to erase depen- 
dence on the initial conditions, so the effect is smaller at lower 
redshifts. 



generated using N-GenlCs, modified to specify the primor- 
dial power spectrum by a spline, and use separate transfer 
functions for baryons and dark matter, as calculated using 
CAMB ( |Lewis et al.pOOO l. 

For knots B and C, we used the above fiducial param- 
eters for box size and particle resolution. For the D knot, 
we slightly compromised on box size in favour of particle 
resolution, and used simulations of (48, 400), since we found 
that the D knot had a negligible affect on the largest scales. 
To fully capture the behaviour of the A knot, we used larger 
simulations with (120, 400). We have used different sized 
simulations to ensure that for each knot, the characteristic 
scales representing it have very good numerical convergence; 
this issue is addressed in full in Appendix [B] Our ability to 
do this is one technical advantage of our approach compared 
with previous studies; if we were to alter the amplitude of 
the whole power spectrum, we would need to achieve conver- 
gence over all the relevant scales at once. In our approach, 
each simulation only needs strict convergence over the nar- 
row range of scales probed by a single knot. 



of this section gives technical details of our simulations and 
the included astrophysics, and may be skipped by the reader 
interested only in the cosmological implications. 

GADGET has been modified to compute the ionisation 
state of the gas using radiative cooling and ionisation physics 



as originally described by Katz et al. (19961, and used in 



Viel & Haehnelt (20061. Star formation is included via a 



simplified prescription which greatly increases the speed of 
the simulations, where all baryonic particles with overden- 
sity p/po > 10 3 and temperature T < 10 s K are immedi- 
ately made collisionless. Viel et al. (20041 compared simula- 



tions using this prescription with identical simulations using 
a multi-phase model, and found negligible difference in the 
Lyman-Qf statistics. Additionally, all feedback options have 
been disabled, and galactic winds neglected; | Bolton et al.| 



1 2008 ) found that winds have a small effect on the Lyman- 



ol forest. The gravitational softening length was set to 1/30 
of the mean linear inter-particle spacing. 

The gas is assumed to be ionised by an externally 
specified, spatially homogeneous UV background, based on 
the galaxy and quasar emission model of Haardt & Madau 
(20011. We follow previous analyses in assuming that the 



gas temperature is initially in equilibrium with the CMB, 
that the gas is in ionisation equilibrium, optically thin, and 
that we can neglect metals and evolution of elemental abun- 
dances. Lyman-a absorption arises largely from near mean- 
density hydrogen, which should undergo little chemical evo- 
lution, so using a simplified star formation criterion and ne- 
glecting metals is physically well-motivated. Assuming that 
the gas is optically thin and in ionisation equilibrium will 
break down during reionisation, but at the redshifts we are 
interested in, we can model the effect of non-instantaneous 
reionisation by increasing the photo-heating rate, as de- 



scribed in Viel & Haehnelt (2006) 



The fiducial simulation for this paper has a box size 
of 60 Mpc/i" 1 and 2 x 400 3 gas and dark matter particles, 
[which we will write as (60, 400) in future], and runs from 
z — 199 to z = 2. Snapshots are output at regular inter- 
vals between redshift 4.2 and 2.0. Initial conditions were 



2.4 IGM thermodynamics 

Constraints on the thermal history of the IGM are given in 
terms of the parameters of a polytropic temperature-density 
relation 



T = T 



7-1 



(4) 



where a given SPH particle has temperature T and density p. 
To and po are the average quantities for the whole simulation 
snapshot. To determine To and 7 from a simulation box, 
a least-squares fit is performed from low-density particles 
satisfying 



1.0 < log 



Po 



< 0. 



(5) 



Regions that are less dense than the lower limit above are 
ignored because they are poorly resolved in SPH simulations 
(Bolton & Becker 2009). The simplified star formation cri- 
terion means that many overdensities have been turned into 
stars, and their baryonic evolution not followed; hence they 
are also neglected. Both 7 and To are assumed to follow a 



power law broken at z — 3 by Hell reionisation ( Schaye et al 



2000 1, so that they are given by: 



d-y" 



T 



1 A [(1 + *)/4] 
1 A [(1 + *)/4] 
T A [(l + ^)/4] a ^ 
T A [(1 + z)/4] dT ° ifz>3. 



if z < 3, 
if z > 3. 

if z < 3, 



(6) 



When performing parameter estimation, we marginalize over 
7 A , T A and dr( S ' R , dTo' R . The different thermal histories 
were constructed by modifying the fiducial simulation's 
photo-heating rate as described in Section 2.2 of |Bolton et al.| 
( [2008 1. 

The effective optical depth is described by a power law, 
with parameters: 



r cS = r A ff [(1 + 2 )/4pf 



(8) 
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Previous studies (McDonald et al. 005b Viel & 



Hachnelt 2006) used the same transfer function for both 



dark matter and baryon particles; we have used different 
transfer functions for baryon and dark matter species. At our 
starting redshifts, the transfer functions for the baryons are 
about 10% lower than for the dark matter on these scales, 
because baryon fluctuations have not grown as fast during 
tight coupling. Once they have decoupled from the photons, 
the baryons fall into the potential wells of the dark mat- 
ter, and by z = 1, the linear transfer functions are almost 
identical. At redshifts 2 — 3, however, the effect is small but 
noticeable, and accounts for a 2% scale-independent drop in 
the power spectrum. This is too small to affect current data, 
but could be potentially important for analysing BOSS data. 

2.5 The flux power spectrum 

In the case of Lyman-a, the observable is not a direct mea- 
surement of the clustering properties of tracer objects, as 
in galaxy clustering, but the statistics of absorption along 
a number of quasar sightlines. Therefore we define the flux, 
J-, as 



T = exp(-r), 



(9) 



where r is the optical depth. We define the flux power spec- 
trum as 



p F (k) = \8 F {k)\ : 



(10) 



Here T is the mean flux. The tilde denotes a Fourier 
transformed quantity, where our Fourier conventions, used 
throughout, are: 



f(k) 



/(^ ,h dx. 



(11) 



To aid the eventual understanding of our results, we 
digress slightly here to review the physical effects of the var- 
ious thermal parameters on the flux power spectrum. The 
mean flux, essentially a measure of the average density of 
neutral hydrogen, has a large impact on the amplitude of 
the flux power spectrum. Cosmological information from the 
Lyman-a forest is obtained through examining the power 
spectrum shape and its redshift dependence. The effect of 
a higher temperature, as preferred by the flux power spec- 
trum, is to suppress power predominantly on small scales, as 
a higher temperature wipes out small-scale structure in the 
baryons. The exponent of the temperature-density relation, 
7, controls the temperature difference between voids and 
overdensities. A higher 7 makes voids cooler and overden- 
sities hotter. At high redshifts, where much of the Lyman- 
a absorption comes from voids, the effect of an increased 7 
is to decrease the temperature of the Lyman-a emitting re- 
gions, so there is relatively more small-scale structure. At 
low redshifts, however, most of the Lyman-a absorption 
comes from near mean density material, and so an increased 
7 increases the temperature, decreasing the amount of small- 
scale structure. For further details of the physical effects of 
the various parameters, see Section 4.2.1 a nd Fig. 3 of|Viel 



fc Haeh nelt (|2006|), as well as Fig. 11-13 of McDonald et al. 
005bf " 

Current constraints on Pf are given by[McDonald et al.| 



(20061, determined from ~ 3000 SDSS quasar spectra at 
2 = 2-4. 

Each simulation snapshot was processed to generate 
an averaged flux power spectrum as follows. First, 8000 
randomly placed simulated quasar sightlines were drawn 
through the simulation box. For a 60 Mpc/1 -1 box, this con- 
stitutes an average spacing between sightlines of 670 /i _1 kpc, 
corresponding to scales of roug hly k = 10 Mpc/i" 1 , far 
smaller than the scales probed by the Lyman-a forest. We 
verified that doubling the number of sightlines to 16000 
made a negligible difference to the resulting power spectra. 

When calculating absorption, particle peculiar veloci- 
ties were included, which increases the (non-rescaled) mag- 
nitude of the power spectrum by approximately 10%. 

To generate the flux power spectrum, the absorption 
due to each SPH particle near the sightline is calculated, 
giving us a number of simulated quasar spectra, which are 
smoothed with a simple boxcar average. Each spectrum is 
rescaled by a constant so that the mean flux across all spec- 
tra and absorption bins matches that observed by Kim et al.| 
(2007). This rescaling hides our ignorance of the amplitude 
of the photo-ionizing UV background. The mean over all 
the rescaled spectra is then used as the extracted flux power 
spectrum for the box. For further details of how we com- 
puted the absorption, see Appendix |A"| 

We follow previous work in not attempting to model 
continuum fitting errors. The Si III contamination found 
by |McDonald et al.] ( |2006[ ) is modelled by assuming a linear 
bias correction of the form P' F = [(1 + a 2 ) + 2acos(i>fc)] Pf, 
with a = / S ilii/(1 - T), /sail = 0.011, and v = 2271 km/s. 

Finally, since high-density, damped Lyman-a systems 
(DLAs) are not modelled by our simulations, we add a cor- 
rection to the flux power spectrum to account for them, of 
the form calculated by |McDonal d et al.| |005a). The am- 
plitude of this correction is a free parameter, and will be 
discussed further in Section [2.6.21 

We checked the convergence of our simulations with re- 
spect to box size and particle resolution. Here we give only 
a brief summary of the results; further details may be found 
in Appendix [B] For the highest redshift bins at z — 4.2, 4.0 
and 3.8, increasing the particle resolution had a large effect 
on the flux power spectrum. Achieving numerical conver- 
gence for the Lyman-a forest at high redshift is challenging, 
because most of the signal for the Lyman-a forest is coming 
from poorly resolved underdense regions. In addition, cur- 
rent data at high redshifts are much more noisy than at low 
redshifts, and future surveys will not probe these redshifts 



at all. Accordingly, we follow Viel & Haehnelt ( 2006 \ and 



do not use the three highest redshift bins in our analysis. 

At lower redshifts, and except in the smallest and 
largest fc-bins, the change with increased particle resolu- 
tion was small. On the smallest scales, however, there was 
a change of around 5% in each bin. This increase is sys- 
tematic, and so we correct for it as described in Appendix 
[B] The larger box increased power on the largest scales by 
around 5%, due to sample variance in the simulation box. 
The methodology we used to correct for this effect is again 
detailed in Appendix [B] 

The above figures were the dominant errors in our mod- 
elling of the flux power spectrum. Uncorrected modelling er- 
rors are therefore < 2% of the flux power spectrum in each 
bin, far below the current measurement error of ~ 12% in 
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each bin of the flux power spectrum, and on the order of 
the expected statistical errors for the BOSS survey, which 
are ~ 1.5%. A significant decrease in modelling errors would 
require the use of simulations with improved particle reso- 
lution, which are beyond the computational resources avail- 
able to us. 



2.6 Parameter estimation 

So far we have given a formula for the primordial power spec- 
trum, and described how we use it to extract a flux power 
spectrum to compare with observational data. In this sec- 
tion, we shall describe how we actually performed that com- 
parison. First we describe a scheme for robustly interpolat- 
ing the parameter space to obtain flux power spectra corre- 
sponding to parameter combinations which we have not sim- 



ulated, following Viel & Haehnelt (20061. Secondly, we de- 



scribe the parameters of the Monte Carlo Markov chains we 
used for parameter estimation. For more details of MCMC, 
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Figure 2. The difference between the flux power spectrum as 
obtained from interpolation, and directly by simulation. Here only 
the C and D knots have been changed from their initial values. 
Each line represents simulation output at a different redshift bin, 
between z = 2.0 and z = 4.2. The grey band shows 1% error bars. 



2.6.1 Parameter interpolation 

Directly calculating a flux power spectrum from a given set 
of primordial fluctuations requires a hydrodynamical simu- 
lation. This makes it impractical to directly calculate Pf for 
every possible set of input parameters. Instead, simulations 
are run for a representative sample and other results are 
obtained from these via interpolation. We assume that the 
flux power spectrum varies smoothly around the best-fitting 
model, parametrize this variation with a quadratic polyno- 
mial for each data point, and then check that this accurately 
predicts new points. If we have some simulation with a pa- 
rameter vector which differs from a 'best-guess' simulation 
by 8pi, the corresponding change in the flux power spectrum, 
SPf, is given by 



SP F = ^2 ctjSpj + Pj5p 



(12) 



The coefficients of this polynomial are constrained by per- 
forming a least-squares fit to flux power spectra generated 
by numerical simulations. We experimented with including 
cross-terms (of the form PiPj), but found that this did not 
significantly improve the accuracy of the interpolation. 

To estimate the interpolation coefficients, we used seven 
simulations for each of our four power spectrum parameters, 
one of which was used to test the accuracy of the interpola- 
tion. To check for correlation between parameters, we simu- 
lated varying two neighbouring knots at once. As the great- 
est effect of each knot on the flux power spectrum is over a 
localized range of scales, our interpolation errors should be 
maximal here. We needed only four simulations per thermal 
history parameter, and checked we could accurately predict 
SPf for a very different thermal history. As a final interpo- 
lation verification, we performed a simulation where all six 
parameters were changed simultaneously. Fig. [2] shows the 
interpolation errors for one of our tests, which are around 
1% of the total change for each bin. This is smaller than the 
expected statistical errors for BOSS, and was replicated by 
our other test simulations. 



2.6.2 MCMC methodology 

To perform parameter estimation, we use a version of the 



publicly available CosmoMC Lewis & Bridle ( 2002 1 code, 



with a modified likelihood function as described in Section 

i2~n 

We marginalize over four parameters for the four knots, 
with priors as specified in Table[l] and over eight parameters 
of the thermal history, as described in Section [2. 3[ 



We follow the advice of McDonald et al.J (|2006[) , and add 



a number of nuisance parameters to the SDSS data, all with 
Gaussian priors. To parametrize uncertainty in the resolu- 
tion of the spectra, we add a parameter a 2 with prior 0±49, 
and multiply the flux power spectrum by exp (— fc 2 a 2 ). The 
effect of an increased a 2 is therefore to damp power on the 
very smallest scales. Each redshift bin has one parameter, 
fi, to describe uncertainty in the subtraction of background 
noise, with a prior of 0±0.05. To marginalize over the uncer- 
tainty in the effect of DLAs, we add A^ amp , with a prior of 
1 ± 0.3. The effect of this correction is to increase the slope 
of the flux power spectrum. 

We also marginalize over residual uncertainties in the 
Hubble parameter, h and fijw, using flat priors of 0.2 < 
Om < 0.4, and 0.5 < h < 0.9. For the rest of our background 
cosmology, we assume parameters in agreement with those 
preferred by WMAP 7, including negligible gravitational 
waves and spatial curvature. The priors on h and Qm make 
a negligible difference to our results, because both these pa- 
rameters only weakly affect the Lyman-a forest. We assume 
To < 50000 K and < 7 < 5/3 on physical grounds; the 
temperature-density relation of the IGM cannot be steeper 
than the perfect gas law, and very high temperatures would 
contradict independent measurements of the IGM tempera- 



ture by Schaye et al. (20001 



2.6.3 Cross-validation methodology 

Cross-validation (CV) requires the splitting of the data set 
into n independent sets. For best results, these sets should 
be as uncorrelated as possible. We choose to use alternating 
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bins in k for each set. For data with n k bins, the first set 
would consist of bins 1, 4, 7 ... , the second bins 2, 5, 6 . . . 
and the third similarly. 

To calculate the CV score, we estimate the best-fit from 
the two training sets, using an MCMC. The CV score for the 
remaining, validation, set is the likelihood of this best-fit. 
The total CV score for a given penalty is the sum of the CV 
scores for each set. 



3 DATASETS 

3.1 Current data from SDSS 

The SDSS data used in this study consist of a best-fitting 
flux power spectrum in 12 fc-bins and 11 redshift bins, to- 
gether with a covariance matrix and a set of vectors de- 
scribing the foreground noise subtraction. It was analysed 
by |McDonald et ah] ( |2006[ ), and comes from 3000 quasar 
spectra. Of these, ~ 2000 are at redshift 2.2-3, and ~ 1000 
above that. We use the 8 redshift bins at z < 3.8 only. 

We have chosen not to include any additional small- 
scale information based on high-resolution quasar spectra. 
In principle, this can help break degeneracies and should be 
included in future analyses. Currently, however, systematic 
error from such data sets is hard to quantify, and the opti- 
mal method for extracting the thermal state of the IGM is 
not yet clear. Our focus in this work has been robustness, 
and so we have limited ourselves to a single data set, whose 
properties have been extensively studied and are relatively 
well- understood. 



3.2 Simulated data from BOSS 

In this section we will describe our simulated data for fore- 
casting constraints from BOSS, an ongoing -future survey 
which will acquire 1.6 x 10 s quasar spectra JSchlegel et al. 
20091, between z = 2.2 — 3.0. We need to simulate both a 



covariance matrix and a flux power spectrum. 

We have assumed that the noise per spectrum of the 
BOSS data will be approximately the same as they were for 
SDSS. This is a simple assumption, but broadly justified be- 



cause both surveys use similar instruments (Schlegel et al 



2009). Truly accurate modelling of the covariance matrix is 



impossible until the release of the final data, however, we 
expect our modelling of the BOSS covariance matrix to be 
completely adequate for a forecast. Our simulated BOSS co- 
variance matrix is simply the SDSS covariance matrix scaled 
to account for the increase in statistical power resulting from 
the much greater number of quasar sightlines. There are 
roughly 2000 quasar sightlines in the SDSS sample below 
z = 3, so the scale factor is 2000/160000 = 1/80. 

To generate the flux power spectrum, we used cos- 
mological parameters consistent with the best-fitting re- 
sults from WMAP 7, and thermal parameters consistent 
with theoretical expectations: 7 ~ 1.45 and To = 2.3 x 
10 3 [(1 + z)/4]°- 2 K. The effective optical depth was r = 
0.36[(l + 2 )/4] 3 ' 65 . The power spectrum amplitude was se- 
lected to match a spectrum with as = 0.8 and n s = 0.96. 

We then added uncorrelated Gaussian noise with a vari- 
ance given by the diagonal elements of the simulated BOSS 
covariance matrix. As BOSS will only take data at z ^ 3, we 



dispense with the thermal parameters for higher redshifts. 
The foreground noise properties of the BOSS data are ex- 
pected to be similar to those of the SDSS data; we therefore 
leave the priors on the parameters measuring uncertainty in 
the noise subtraction and the parameter measuring resolu- 
tion uncertainty, a 2 , unchanged. 

BOSS is also expected to determine the transverse flux 
power spectrum. Simulating the larger scales needed to 
properly model the effect of this is beyond the scope of this 
paper, and we refer the interested reader to |Slosar et al.| 



(20091; White et al. (20101 



4 RESULTS 



4.1 Current constraints 



Fig. [3] shows the CV-reconstruction of the primordial power 
spectrum from SDSS data. We have emulated confidence 
limits by plotting the envelope of samples which have a like- 
lihood in the top 68% and the top 95%. At the 95% level, 
the power spectrum is allowed to oscillate more within the 
allowed envelope, but the size of the overall constraint on 
the amplitude does not greatly change, as found by [Verde] 
& Peirisl d2008b. 



We have shown plots for two penalties: one high, one 
low. This was because we have been unable to determine an 
optimal penalty from current data; the CV score shows no 
significant variation, even when the penalty is having neg- 
ligible impact on the likelihood. We interpret this to mean 
that the shape constraints on the primordial power spectrum 
from current Lyman-a data are very weak. 

Previous analyses assumed a power-law prior for the 
shape of the primordial power spectrum, and constrained 
this slope and the overall normalisation from the same data 
used above. While such parameter estimation leads to tight 
constraints from the data (assuming the underlying shape 
prior is correct), relaxing this tight prior leads to the loss of 
ability to constrain the scale-dependent shape of the power 
spectrum. The current data can still be used as part of a 
minimally parametric primordial power spectrum if one ex- 
ploits the extended range in scales that can be probed in 
combination with other data sets ( |Peiris fc Verde|2010 |. 

The black error bar in Fig. [3] shows a comparison with 
|Viel et al.|p009| . Our method gives results for the ampli- 
tude of the primordial power spectrum at Lyman-a scales 
which are completely consistent with that work, but some- 
what weaker. This is to be expected; we are removing a tight 
prior on the shape of the power spectrum. For a very high 
penalty, i.e. the limit at which the implicit prior in our anal- 
ysis approaches a power-law spectrum, we can reproduce 
the error bars of Viel et al. ( 2009 1. We are also in agreement 



with the results of an earlier analysis of the Lyman-a forest, 
|McDonald et al.| ( |0l)5b"| ), which constrained <t 8 = 0.85L0.13. 

The corresponding constraints on n s from our recon- 
struction are extremely weak, especially for the low penalty: 
n s ~ 0.2 — 1.2. The constraints on n 3 in |Viel et al. (2009), 
in addition to the power-law prior, were greatly assisted by 
the fact that the pivot scale ko in Eq. [I] was chosen to be 
ko = 0.002 Mpc -1 ; a small change in the slope of the power 
spectrum at ko leads to a large change in power spectrum 
amplitude by k — 1 Mpc -1 . Here, we are trying to constrain 
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Figure 3. Constraints on the primordial power spectrum from SDSS data from CV, for low (left) and high (right) penalties. Black 
circles show the positions of the knots, with arbitrary normalisation. The light blue regions show the top 68% of likelihoods for SDSS 
data, while the dark blue regions show the 95% likelihood range. The black error bar shows the results of previous analyses | |Viel et al.| 
assuming a power-law power spectrum at k = 1 Mpc" 1 . The dashed lines show limits on the slope from that work. 
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Figure 4. Constraints on the primordial power spectrum from 
the penalty term alone, using the value in the "low penalty" plot 
of Fig.[3]above. Dashed lines show the power specrum range sam- 
pled by the simulations. 



a scale-dependent n s (k) — 1 + 4ttt using only the interval 
of scales sampled by Lyman-a forest. We find that, while the 
current Lyman-a data are able to constrain the amplitude 
of the power spectrum at these scales, they are not powerful 
enough on their own to significantly constrain the shape of 
the spectrum in a robust manner. At no penalty do we see 
any evidence in the current data against a scale-invariant 
power spectrum. 

We can explicitly demonstrate that the current Lyman- 
a data add little information to a weak prior on the shape 
of the power spectrum in the following way. Fig. [4] shows a 
minimally parametric reconstruction assuming the penalty 
designated "low" . These constraints were generated without 
using any data whatsoever, and are similar to those obtained 
with the Lyman-a forest data. This figure shows clearly that 
our SDSS constraints are affected by the prior even for the 
low penalty. Since the penalty is proportional to P"(k), it 
cannot determine the power spectrum amplitude. Instead, 
the allowed power spectrum amplitude is simply the minimal 
range probed by our simulations. 



The CV part of our method involves reconstructing the 
optimal penalty, and thus the strength of the shape prior jus- 
tified by the data. CV is essentially a method to reconstruct 
the most favoured prior correlation between knots; since the 
prior is reconstructed from the data, prior-driven constraints 
would not necessarily be a problem. However, here we are 
finding that no particular prior is favoured over any other. 
Thus, the width of the envelopes in Fig. [3] are actually arbi- 
trary and should not be used to draw conclusions about the 
amplitude of primordial fluctuations at Lyman-a scales. 

We performed a number of checks to determine the 
cause of our failure to find an optimal penalty. Changing 
our methodology for splitting the data into CV bins did not 
affect the results. A flux power spectrum simulated in the 
same way as our BOSS data, and using the same parameters, 
but with error bars of the same magnitude as the current 
data showed no preference for a particular penalty, despite, 
as we shall see, there being a well-defined optimal penalty 
for BOSS simulations. Fixing the thermal history parame- 
ters 7 and To to fiducial values was also sufficient to allow 
us to reconstruct a penalty. Therefore statistical error and 
systematic uncertainty in the thermal history are the sig- 
nificant factors preventing us from robustly reconstructing 
a minimally parametric power spectrum shape from current 
data. 

Constraints on the thermal history parameters are as 
follows. For the low penalty we found 0.8 < 7 < 1.7 at la 
(recall that this upper limit is imposed as a physical prior), 
while for the high penalty 0.2 < 7 < 1.7. The correspond- 
ing constraint from Viel et al. (2009) is 7 = 0.63 ± 0.5. 



There is a noticeable decrease in the best-fitting value of 7 
with an increased penalty (i.e. a stronger shape prior). We 
find it intriguing that we prefer an inverted temperature- 
density relation with 7 < 1.0 only for a high penalty, but 
the constraints are so weak that we cannot draw any solid 
conclusions from them. 

Constraints on the other parameters at la were sim- 
ilar for both penalties. Those for the low penalty were: 
50000 > T > 35000 K, r eff = 0.33 ± 0.03, a slope of 
T c s ff = 3.3T0.3, h = 0.7T0T5 and fi M = 0.25T0.04. Finally, 
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constraints on the noise parameters largely reproduce the 



than our prior from Kim et al. (20071, so this prior provides 



priors (listed in Sec 2.6.21. Our results mirror those of Viel 



et al. (20091; we have therefore verified that those results 



are not biased by a shape prior on the power spectrum. The 



constraints of this work and Viel & Haehnelt ( 2006 ) on the 



IGM temperature, To, prefer a larger central value than that 
obtained by McDonald et al. ( |005b[ ). However, IMcDonald 



et al. (005b) imposed a prior of T = 20000T2000A", derived 



from analysis of the flux probability distribution function of 
high-resolution quasar spectra, so a direct comparison is not 
possible. For further discussion of this intriguing result, we 



refer the interested reader to Section 5 of Viel et al. (2009) 



4.2 Simulated constraints from BOSS 

Unlike current data, our simulated BOSS data shows a well- 
defined maximum in the CV score. In Fig.[5]we show the con- 
straints using this optimal penalty, together with our input 
power spectrum. The input data is reconstructed very well, 
within an envelope of roughly 0.4 x 10 -9 ; a precision com- 
parable to that of a CV reconstruction from WMAP data 



(Verde & Peiris 20081. Even though our simulated power 



spectrum is nearly scale-invariant, we do not recover a very 
high optimal penalty. This is a feature of our approach; un- 
less the data is noiseless, not all oscillations in the power 
spectrum will be ruled out, and the optimal penalty is one 
which allows for them while being consistent with experi- 
mental noise. 

Our method was designed to extract P(k), and so the 
penalty may not be entirely optimal for the derivative. Even 
given this, our constraints of 0.7 < n s < 1.2 are still com- 
paratively weak. However, even this constraint could be use- 
ful to test for potential systematics, or in combination with 
other data sets. One other important data set will be the 



power spectrum of the cross-correlation of the flux (Viel 
et al.|2002||McDonald fc Eisenstein|2007||Slosar et al.|2009| ), 



which BOSS is expected to measure for the first time. Es- 
timating the power of combined constraints is beyond the 
scope of this paper, but it could be considerable. 

Fig. [6] shows the thermal parameters as reconstructed 
from BOSS data. We have correctly reconstructed our input, 
as marked by the black dots. The reconstructed h and Qm 
were also consistent with their input values; Qm = 0.27 ± 
0.02 (input: 0.267), h = 0.74 ±0.05 (input: 0.72). 

Marginalized constraints on the thermal and noise pa- 
rameters are almost a factor of two better for BOSS than for 
current data. We have assessed the impact that further in- 
formation about the thermal history of the IGM would have 
on cosmological constraints, imposing priors corresponding 
to present and reasonable near-future measurements: 



r eff =0.36 ±0.11, 
To = 23000 ± 3000K , 



r e g = 3.65 ± 0.25 , 
7 = 1.45 ±0.2. 



Constraints on the mean optical depth are from Kim et al. 



Becker 



(20071. For the temperature of the IGM, we follow 
et al.| ( [2"011[ ) and assume a future IGM study has determined 
7 to the required precision. 

The effect of the primordial power spectrum evolves 
with redshift in a different way to To and 7. Hence, suffi- 
ciently accurate data can break degeneracies between them. 
For T e ff, the constraints from BOSS are already much tighter 



no additional information. Overall, therefore, the extra in- 
formation provided by our thermal priors has no significant 
effect on our reconstruction of the primordial power spec- 
trum. 



5 DISCUSSION 

In this work, we have performed a minimally paramet- 
ric reconstruction of the primordial power spectrum, us- 
ing Lyman-a data. This is an extension of[McDonald ct al. 
( |005b[ ); Viel & Haehnelt (20061, who used Lyman-a data to 
measure the amplitude and slope of the primordial power 
spectrum on small scales, assuming that it had a power-law 
shape. Using a highly prescriptive model to fit data, even if 
it is physically motivated, can hide systematic effects, which 
may bias the recovered parameters in a manner which is 
hard to detect unless the bias is extremely large. Further, 
it is vital to go beyond parameter estimation and test the 
underlying model of the primordial power spectrum. This 
can in principle be achieved with a minimally parametric 
reconstruction framework coupled with a scheme for avoid- 
ing over-fitting the data. 



Peiris & Verde (20101, who attempted such a recon- 



struction including Lyman-a data, assumed that the power 
spectrum could be well approximated by an amplitude, a 
power-law slope and its running across the scales probed by 
the Lyman-a forest. In their analysis this assumption was 
justified as the Lyman-a data was treated as a single point 
and combined with CMB and galaxy survey data to recon- 
struct the power spectrum over a wide range of scales. 

However, the only likelihood function available up to 
now contained a power-law assumption about the primor- 
dial power spectrum shape, making it impossible to treat 
the Lyman-a data in a fully minimally parametric manner. 
We remedy this, performing a large suite of numerical simu- 
lations to construct a new likelihood function. The primor- 
dial power spectra thus emulated have considerable freedom 
in their shapes, specified by cubic smoothing splines. This 
provides the first ingredient for a minimally-parametric re- 
construction scheme. 

The second ingredient, as mentioned above, is to avoid 
fitting the noise structure of the data with superfluous oscil- 
lations. To this end, our method uses cross-validation (CV) 
to reconstruct the level of freedom allowed by the data. CV 
is a statistical technique which quantifies the notion that a 
good fit should be predictive. Schematically, it is a method of 
jackknifing the data as a function of a "roughness" penalty. 
A small penalty thus allows considerable oscillatory struc- 
ture in the power spectrum shape, while a larger penalty 
specifies a smoother shape. This penalty term thus performs 
the same function as a prior on the smoothness of the power 
spectrum. Jackknifing the data then tests the predictivity 
of the smoothing prior, choosing as the optimal penalty the 
one that maximizes predictivity. For technical details see 
Section O 



For the Lyman-a current data from SDSS (McDonald 
et al.|2006 1 , C V yields no significant preference for any par- 



ticular penalty. In the context of CV, this indicates that no 
penalty is more predictive or favoured over any other; in 



© 2010 RAS, MNRAS 000,[l}{T4] 



10 S. Bird et al. 



BOSS Simulation 



CT2 




0.4 0.6 1.0 2.0 3.0 



k (MpcT 1 ) 

Figure 5. Constraints on the power spectrum for simulated BOSS data. Black circles show the positions of the knots, normalized to 
match the input power spectra (black line). The orange region shows the top 68% of likelihoods from BOSS-quality Lyman-a data. The 
grey region shows an extrapolation of the Icr results from WMAP data to these scales, and the grey dashed line shows its lower extent. 




Figure 6. Joint 2D posterior constraints on the thermal history using forecast BOSS data. Input parameters are marked by black dots. 
Contours are drawn at 68 and 95 percent CL. See Section |2.3| for definitions of the thermal parameters. 



other words, the data are not sufficiently powerful to accu- 
rately reconstruct the strength of the shape prior. 

The minimally parametric method thus provides no evi- 
dence for features in the power spectrum in the current data, 
and our results are fully consistent with a scale-invariant 
power spectrum. The best-fitting amplitude of the power 
spectrum is, as in previous work, slightly higher than that 
extrapolated from WMAP ( Komatsu et al.|2011 1. However, 
because the data do not contain sufficient statistical power 
to reconstruct the power spectrum shape, our error bars are 
extremely large. An analysis that uses different statistical 



techniques, such as Bayesian evidence ( Jeffreys|1961 1, could 
provide further insight, but is beyond the scope of this pa- 
per. 

In the not so distant future, the first data from a new 
Lyman-a survey, BOSS (Sc hlegel et al.|2009 l, will be made 
available. We simulate a flux power spectrum and covari- 
ance matrix for BOSS, with an 80 fold increase in statistical 
power over the current data. In this case we successfully 
reconstruct the power spectrum, using CV to find an opti- 
mal penalty. The parameters we extract using CV are com- 
pletely consistent with the inputs to the simulation, and the 
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Figure 7. Comparison of our constraints. Blue is from current data; orange is our BOSS forecast. The grey region shows part of a 
reconstruction using both the CMB data and galaxy clustering measured by SDSS l |Peiris fe~ Vcrdc 2010J. The black squares show two 
knots used in the earlier reconstruction, while the black error bar shows lcr constraints on power spectrum amplitude from parameter 
estimation ( |Viel et al.|2009] |. The dashed line shows the extrapolated WMAP best-fitting power spectrum. 



resulting constraints are comparable to those achieved by 



performing CV reconstruction using WMAP data (Verde & 



|Peiris|2008[ ). We verify that statistical error is the factor pre- 
venting us from finding an optimal penalty for current data 
by simulating a power spectrum identical to BOSS, but with 
wider error bars, again failing to find an optimal penalty. 

Finally, we show that adding plausible future data 
on the smallscale thermodynamics of the IGM to BOSS 
does not significantly improve constraints on the primordial 
power spectrum. The simulated BOSS data are sufficiently 
powerful on their own to break degeneracies between the 
IGM and cosmological parameters, and are limited by sta- 
tistical error rather than systematic uncertainty. 

We have not considered the impact of the information 
BOSS is expected to provide on the transverse flux power 
spectrum. This will probe larger scales than our current 
work, offering a longer baseline and thus better sensitivity to 
the overall shape of the power spectrum. However, applying 
the present technique to the improved data set would re- 
quire simulations probing much larger scales, hence greatly 
increasing the numerical requirements. 

Fig. 7] shows the con straints from BOSS in comparison 
to those IPeiris &: Verde| ( |2010[ ) obtained by reconstructing 



the power spectrum using the CMB and the matter power 
spectrum from SDSS. By combining BOSS data with other 
probes ( |Seljak et al.] j2005 , 2006]), such as galaxy clustering, 
the CMB, and the transverse flux power spectrum, we will 
be able to accurately reconstruct the shape of the power 
spectrum on scales of k = 0.001 — 3Mpc _1 , probing ten 
e-folds of inflation. 
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APPENDIX A: SIMULATED SPECTRA 

In this appendix we detail the procedure for extracting a 
spectrum from a simulation snapshot. First, we must find the 
velocity of each particle, including both peculiar velocities 
and the Hubble flow. The effect of peculiar velocities is to 
increase the flux power by around 10%. 

Next, we calculate the optical depth at wavelength A, 
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as defined by the line integral 



o\n H dl , 



(Al) 



where cr\ is the cross-section for the transition and uh is 
the number density of the neutral hydrogen. o\ is given by 
the rest cross-section multiplied by a broadening function 



a\ = a, 



x $ . 



(A2) 



We define the oscillator strength, f a , by the ratio between 
the cross-section of the Lyman-a transition and the cross 
section of the transition involving a free electron 



7rroA 



(A3) 



where A is the rest wavelength of the Lyman-a transition. 
The classical radius of the electron, ro is related to the 
Thompson cross-section, or by 



ro 



3<tt 



Hence 



(A4) 



(A5) 



To compute the broadening function, we neglect natural 
broadening from the intrinsic uncertainty in the energy lev- 
els of the hydrogen atom. Natural broadening is only impor- 
tant in the densest absorbers (damped Lyman-a systems), 
which our simulations lack the resolution to adequately re- 
solve. The effect of DLAs is included by a correction applied 
when calculating the likelihood, for which see section [2. 6. 2| 

Hence the only form of broadening present is Doppler 
broadening. In an absorber of temperature Th, mass mj, 
and velocity v, the probability of a particle having zero ve- 
locity relative to an incoming photon is 

<^p[-f], (A6) 



c 

Tnb 



where b ■ 



2kT H 



Hence, a wavelength bin at position k will suffer absorption 
from a HI absorber in a bin j as: 



-n.tr a A exp 



c 



VI,: 



(A7) 
(A8) 



Here A is the bin width, and a is the expansion factor. 

The flux in each bin is then simply T = e _T . Each 
spectra is smoothed with a simple 3 point boxcar average, 



following Viel et al. (20041, and the flux power spectrum 



from the simulation box is defined to be the average over a 
number of simulated spectra. 



APPENDIX B: CONVERGENCE CHECKS 

In this appendix, we detail the checks we have performed to 
ensure that our simulations are properly converged with re- 
spect to box size and particle number. We have usually been 
comparing the relative change in the flux power spectrum 
when changing a parameter, making strict convergence not 
essential. 



To check box-size convergence, we compared the flux 
Pf(&) for our fiducial simulation with a large box size simu- 
lation ("L") which had size (75, 500), and otherwise identical 
parameters to the fiducial simulation ( "F" ) . This isolates the 
effect of box size by having identical particle resolution to 
simulation F. To test convergence with respect to particle 
number, we used a high resolution simulation ("H"), with 
(60, 500). In order to isolate the change due to numerical 
effects, we did not rescale the mean optical depth for the 
plots in this section. 

The left hand plot of Fig. |B1| shows the change in the 
flux power spectrum with increased box size; simulation L 
divided by simulation F. The flux power spectrum is con- 
verged with respect to box-size; however, there is a system- 
atic increase on large scales. This is due to sample variance: 
the specific realisation of cosmic structure we are using is 
biased slightly low on the largest scales probed by the box. 
The larger box recovers the input power spectrum much bet- 
ter on these scales, because it contains far more modes, and 
hence shows an increase in power. Because we use the same 
realisation of structure for all our simulations, this effect will 
be constant, and is easily corrected for by altering the best- 
fitting power spectrum. Once this is done, convergence of 
the flux power spectrum is very good. 

The right hand plot of Fig. |B1| shows the change in the 
flux power spectrum with increased particle resolution. The 
effect is small, except on small scales or at high redshift. 
Achieving numerical convergence for the Lyman-a forest at 
high redshift is challenging, because most of the signal for 
the Lyman-a forest is coming from poorly resolved under- 
dense regions. In addition, current data at high redshifts is 
much more noisy than at low redshifts, due to a paucity 
of quasar spectra. Accordingly, we follow |Viel fc Haehnelt] 



( 2006 1 and do not use the three highest redshift bins at 



z = 4.2,4.0 and 3.8 in our analysis. 

Our initial base for the central power spectrum was the 
output of simulation H. We corrected for box size and reso- 



lution following the method proposed by McDonald ( 2003 1 



To correct for sample variance on the largest scales, we ran 
two additional simulations with box size 120Mpcft" 1 , and 
400 3 and 200 3 particles. The smaller simulation has identi- 
cal particle resolution to our fiducial simulations, and should 
thus be directly comparable to them. We then corrected our 
best-guess power spectrum by the ratio between the two, 



C s (k) = Pf(N p = 400) /Pf(N p = 200) . 



(Bl) 



The smallest scale bin was excluded from this correction as 
it was clearly being affected by poor resolution convergence. 

Because simulation H is nearly converged, we had to 
be careful when correcting for resolution; if the error in the 
correction is larger than the correction itself, accuracy is 
definitely not increased. Therefore, we ran two simulations, 
with box size of 24Mpc/i _1 ; Tl and T2. Tl has the same 
particle resolution as simulation H, and thus 200 3 particles. 
T2 has 400 3 , giving it much increased particle resolution. 
These simulations do not resolve the largest scales probed 
by the Lyman-a forest at all, but Fig. |B1| shows that these 
scales are not affected by poor resolution convergence. Sim- 
ulation H is corrected by 



C R {k) = PP/PF 



(B2) 



To avoid our correction itself being biased by a small box 
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Figure Bl. (Left) The change in the flux power spectrum due to increasing the box size at fixed particle resolution. Each green line 
shows the effect on a different redshift bin, from 4.2 to 2.0. The effect is generally around 2% (grey box), with a systematic increase on 
large scales, for which we correct (see text). (Right) The effect on the flux power spectrum due to increasing the particle number from 
2 X 400 3 to 2 X 500 3 . Each line shows the effect on a different redshift bin, from 4.2 to 2.0. The grey box show a variation of 4%. Redshift 
bins with greater variation are labelled. 



size, we ignore those bins on scales greater than a quarter 
of the box-size of 24Mpc/i -1 . We also ignore any correction 
for redshift bins where the correction for the smallest scale 
bin is less than the uncertainty in the simulations, which we 
take to be 1%, on the grounds that these are already fully 
converged. We are left with a slight increase in power on the 
smallest scales. 
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